Timed Annotations - Enhancing MUC7 Metadata by the Time It Takes to Annotate Named Entities

Authors

  • Katrin Tomanek
  • Udo Hahn
Abstract

We report on the re-annotation of selected types of named entities from the MUC7 corpus where our focus lies on recording the time it takes to annotate these entities given two basic annotation units – sentences vs. complex noun phrases. Such information may be helpful to lay the empirical foundations for the development of cost measures for annotation processes based on the investment in time for decision-making per entity mention.

Similar resources

Metadata Collection for Personal Multimedia Repositories Using GWAP

With the increasing number of personal albums and photos in them, users have more and more problems with their organization [4]. This is due to lack of descriptive data, however, their amount only depends on the abilities and will of the image owners. Tools for metadata creation are available – the main problem is with the user motivation: because tagging and annotating of photos is usually a b...

Bootstrapping Information Extraction Using Regularity of Web Pages

To annotate web documents with metadata automatically, we must prepare a database that stores annotation targets and these metadata. In the case of location information, we need a database that stores many named entities (NEs) and their location information (i.e., telephone number and address). In this paper, we present a bootstrapping approach to extract triples. We describe our extraction met...

An Extended Study of Content and Crowdsourcing-related Performance Factors in Named Entity Annotation

Hybrid annotation techniques have emerged as a promising approach to carry out named entity recognition on noisy microposts. In this paper, we identify a set of content and crowdsourcing-related features (number and type of entities in a post, average length and sentiment of tweets, composition of skipped tweets, average time spent to complete the tasks, and interaction with the user interface)...

A Bootstrapping Approach for Geographic Named Entity Annotation

Geographic named entities can be classified into many subtypes that are useful for applications such as information extraction and question answering. In this paper, we present a bootstrapping algorithm for the task of geographic named entity annotation. In the initial stage, we annotate a raw corpus using seeds. From the initial annotation, boundary patterns are learned and applied to the corp...

Annotation Time Stamps - Temporal Metadata from the Linguistic Annotation Process

We describe the re-annotation of selected types of named entities (persons, organizations, locations) from the MUC7 corpus. The focus of this annotation initiative is on recording the time needed for the linguistic process of named entity annotation. Annotation times are measured on two basic annotation units – sentences vs. complex noun phrases. We gathered evidence that decision time...

Publication date: 2009